Fujisaki model based F0 contours in Vietnamese TTS

Authors

  • Dung Tien Nguyen
  • Chi Mai Luong
  • Bang Kim Vu
  • Hansjörg Mixdorff
  • Huy Hoang Ngo
Abstract

This paper presents preliminary work towards integrating the Fujisaki model into the VnVoice Vietnamese TTS system, using a set of rules to control the F0 contour. A speech corpus was compiled from 20 sentences, each of which can take on different meanings depending on the tone of a monosyllabic keyword it contains. The corpus, 46 sentences in total, was recorded by a female speaker whose voice had also been used in the speech corpus for VnVoice, and was labeled at the syllable level. Tone contrast perception results and naturalness comparisons show that the Fujisaki model works well in modeling the F0 contours of Vietnamese tones.
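For readers unfamiliar with the Fujisaki model named in the abstract, the following is a minimal sketch of its standard formulation, in which log F0 is a base value plus superposed phrase and accent (tone) command responses. It is not the rule set used in VnVoice; the function names and the default values of alpha, beta and gamma are illustrative assumptions, not figures taken from the paper.

```python
import math

def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase control mechanism, Gp(t).
    alpha (1/s) is an assumed, illustrative value."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent/tone control mechanism, Ga(t),
    capped at the ceiling gamma. beta and gamma are assumed values."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """F0 in Hz at time t (seconds).

    fb      -- base frequency in Hz
    phrases -- list of (Ap, T0): phrase command amplitude and onset time
    accents -- list of (Aa, T1, T2): accent/tone command amplitude,
               onset time and offset time
    """
    log_f0 = math.log(fb)
    for ap, t0 in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for aa, t1, t2 in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)

# Hypothetical commands for illustration only (not from the corpus).
contour = [fujisaki_f0(0.01 * i, 200.0, [(0.5, 0.0)], [(0.3, 0.2, 0.5)])
           for i in range(100)]
```

Before any command begins, the contour sits at the base frequency; each command then raises or lowers log F0 by an amount scaled by its amplitude.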


Similar resources

A novel approach to the fully automatic extraction of Fujisaki model parameters

The generation of naturally-sounding F0 contours in TTS is important for the intelligibility and perceived naturalness of synthetic speech. In earlier works the author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0. The extraction of parameters for this model from the extracted F0 contour, however, poses p...


Towards the automatic extraction of Fujisaki model parameters for Mandarin

The generation of naturally-sounding F0 contours in TTS enhances the intelligibility and perceived naturalness of synthetic speech. In earlier works the first author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0, and an automatic procedure for extracting the parameters from the F0 contour which, however, ...


An Overview of Prosodic Modelling for Croatian Speech Synthesis

To include prosody in text-to-speech (TTS) systems, prosody knowledge needs to be acquired, represented and incorporated. Two main features of prosody important for modelling prosody for TTS systems are duration and F0 contour. There are various approaches to modelling those features and they can be categorized into three main groups: rule based, statistical and minimalistic. Some...


A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis

The superpositional model of fundamental frequency (F0) contours, as suggested by the Fujisaki model, can represent the F0 movements of speech well while keeping a clear relation with the linguistic information of utterances. Therefore, HMM-based speech synthesis is expected to improve by exploiting the merits of the superpositional model. In this paper, a targets-based superpositional model is proposed in the light of ...


Statistical Approach to Fujisaki-model Parameter Estimation from Speech Signals and Its Quantitative Evaluation

We have previously proposed a statistical model of speech F0 contours, which is based on the discrete-time version of the Fujisaki model. One advantage of this model is that it allows us to introduce statistical methods to learn the Fujisaki-model parameters from speech F0 contours. This paper proposes several modifications to our previous model and parameter inference algorithm, and quantitati...




Publication date: 2004